MSBA6461 Individual Project
ΒΆ

Amy (Nai Wen) Chiu

Routing Optimization
ΒΆ

Table of ContentΒΆ

  1. Overview
    • Project Objectives
    • Techniques
  2. Common Routing Optimization Problems and Solutions
    • Shortest Distance or Traveling Time
    • Vehicle routing problem
    • Traveling Salesman Problem
    • Traveling Purchaser Problem
  3. Dataset
  4. Examples of Solving Each Problem
    • Import Packages
    • Preprocessing
    • Shortest Path Problem
      • Creating Road Map
      • Calculating Shortest Path
    • Traveling Salesman Problem
      • Creating Distance Matrix
      • Calculating TSP Optimal Path
    • Vehicle Routing Problem
      • Calculating VRP Optimal Path
  5. Reference

Overview ΒΆ

Project Objectives ΒΆ

Route optimization helps in improving the efficiency of operations. With streamlined routes, drivers can complete more deliveries or service calls in less time. This increased productivity can lead to higher customer satisfaction and profitability. Route optimization also allows for better use of resources: it can help in effectively managing the fleet and reducing the need for additional vehicles or drivers. To meet this demand, many companies offer optimization solutions through apps and APIs that seamlessly integrate with business operations.

As a former data analyst intern in an e-commerce platform's logistics division, I developed a keen interest in implementing ML model in this field. I aim to delve into these mechanisms and try to solve some common problems with small set of data.

Techniques ΒΆ

  • Dijkstra's algorithm: This algorithm is used to determine the shortest paths between nodes within a weighted graph. It's particularly effective for optimizing routing and navigation problems by identifying the most efficient routes between points.
  • OR-Tools: OR-Tools is a comprehensive optimization package developed by Google. It encompasses a range of functions for setting up optimization environments, defining business constraints, and implementing various optimization strategies. It's a versatile tool used in fields like logistics, scheduling, and resource allocation to find optimal solutions to complex problems.
  • Map visualization package: Folium is a powerful tool for visualizing geospatial data on interactive maps. It provides a user-friendly interface to create maps with custom markers, overlays, and layers, making it ideal for displaying geographical information and analyzing spatial relationships.
  • Map information package: OSMnx is a package designed for retrieving, constructing, and analyzing street networks and other spatial data from OpenStreetMap. It's particularly useful for extracting detailed map information, such as street networks, building footprints, and elevation data, which can be leveraged for various geographical analyses and visualizations.

Common Routing Optimization Problems and Solutions ΒΆ

These represent several typical routing optimization challenges. Variations stem from these fundamentals but entail different and more intricate objectives.

Shortest Distance or Traveling Time ΒΆ

The most straightforward problem is to find the shortest route from point A to point B, with the objective function based on either geographical distance or travel time. This scenario does not include additional obstacles or requirements, such as passing through point C or starting or ending at a specific node.

Dijkstra's algorithm is commonly used to solve this problem. It works by selecting the unvisited vertex with the lowest distance, then calculating the distance through it to each unvisited neighbor and updating the neighbor's distance if it's smaller. This process continues until the shortest path from A to B is determined.

The algorithm and scenario can be implemented under the assumption that each location is connected by a direct road, disregarding red lights, stop signs, toll roads, and other obstructions. This simplification allows for a straightforward calculation of the shortest route based solely on geographical distance or travel time.

Here's an example of how the algorithm works in practice. Let's say we want to determine the shortest route from node 1 to node 9. The main loop of the algorithm operates as follows:

  1. Begin by visiting each node starting from the vertex with the smallest known distance from the start node. Initially, this will be the origin (node 1).
  2. For the current vertex being visited, examine its neighboring nodes that haven't been visited yet.
  3. Calculate the distance from each unvisited neighboring node to the start node. For instance, starting from node 1, we would calculate the distances from 1 to 2 as 2, from 1 to 3 as 5, and from 1 to 4 as 2.
  4. If the calculated distance is shorter than the known distance, update the shortest distance.
  5. Update the previous vertex for each updated shortest distance.
  6. Add the current vertex to the list of visited vertices. This loop continues until all vertices have been visited.

Dijkstra_algorithm.png

Vehicle Routing Problem ΒΆ

Vehicle routing problem is a generic name given to a whole class of problems concerning the optimal design of routes to be used by a fleet of vehicles to serve a set of customers. This becomes more complex as we include more factors like vehicle capacity and driver availability. For example, in a delivery company, it asks for a determination of a set of routes, S, (one route for each vehicle that must start and finish at its own depot) such that all customers' requirements and operational constraints are satisfied and the objective function is minimized or maximized.

Different variants of Vehicle Routing Problems include:

  1. Capacitated Vehicle Routing Problem: Vehicles have limited carrying capacity for the goods they must deliver.
  2. Vehicle Routing Problem with Time Windows: Delivery locations have specific time windows within which deliveries must be completed.
  3. Vehicle Routing Problem with Pickup and Delivery: Goods need to be transported from pickup locations to delivery locations.
  4. Vehicle Routing Problem with Profits: Vehicles are not obligated to visit all nodes; the objective is to maximize the total collected profits.

With different business constraints and background The objective function can also be very different depending on the particular application of the result but a few of the more common objectives are:

  1. Minimize the global transportation cost based on the global distance travelled as well as the fixed costs associated with the used vehicles and drivers
  2. Minimize the number of vehicles needed to serve all customers
  3. Least variation in travel time and vehicle load
  4. Minimize penalties for low quality service
  5. Maximize a collected profit/score.

vehicle_travelling_problem.png

Here are some optimization algorithm commonly used in solving vehicle routing problem:

  1. Genetic Algorithms (GA): they search problems by mimicking the principles of evolution and natural selection observed in biological systems. They start with random strategies and solutions, assess their fitness, then allow the best ones to combine strategies, akin to how animals pass traits to offspring. Introducing randomness via mutation, like new traits arising in living things, adds diversity. This cycle repeats over generations, making solutions smarter and more optimized, similar to how species adapt over time to their environments.

  2. Neighborhood search algorithms: these algorithms work by iteratively exploring the neighborhood of a current solution to find improved solutions. It explore small changes (neighborhoods) from your current solution and moving towards better solutions within that neighborhood. They're great for problems where you want to find the best solution by making incremental improvements step by step.

  3. Simulated Annealing (SA): it is similar to neighborhood search algorithms, but it adds a "temperature" parameter. This temperature controls how much randomness is allowed in the search. Starting with a high temperature enables extensive exploration and random movements across the solution space. As the algorithm progresses through iterations, the temperature gradually decreases following a predefined cooling schedule. This cooling process makes the algorithm more deterministic, meaning it becomes less likely to accept worse solutions as it progresses. The algorithm stops either when it reaches the maximum number of iterations or when a termination condition is met.

  4. Ant Colony Optimization (ACO): it mimics ants' pheromone-based communication to find optimal paths. Artificial ants randomly explore solutions, choosing paths based on pheromone levels and heuristics. Higher pheromone levels and shorter paths are favored. ACO updates pheromone information iteratively, using historical data to guide exploration, unlike neighborhood search algorithms that focus narrowly on immediate solutions.

Traveling Salesman Problem ΒΆ

The Traveling Salesman Problem (TSP) is a specific type of routing vehicle problem that focuses on finding the shortest possible route for a salesman who needs to visit a list of cities exactly once and return to the starting city. The goal is to find the most efficient route that allows the salesman to visit every city exactly once and return to the city where they started their journey. This problem is crucial in logistics, transportation planning, and optimizing travel routes.

It's important to note that as the number of cities increases in the Traveling Salesman Problem, the sheer number of possible routes grows exponentially, posing a significant challenge for finding the optimal solution. To address this challenge, a range of optimization algorithms are employed, including heuristic approaches like nearest neighbor algorithms, genetic algorithms, simulated annealing, and ant colony optimization. These algorithms employ diverse strategies to progressively enhance the route by analyzing subsets of cities and assessing potential routes based on factors such as total distance or time.

salesman_problem.png

Traveling Purchaser Problem ΒΆ

The Traveling Purchaser Problem resembles the Traveling Salesman Problem but incorporates inventory availability along the route, in addition to distance and transportation costs. The optimization algorithms commonly used for vehicle routing problems are also applicable to solving this type of problem.

Dataset ΒΆ

A comprehensive collection of Starbucks locations worldwide, containing data on Street Address, Country, City, Longitude, and Latitude. For the sake of project simplicity, I will focus exclusively on stores located in San Francisco, CA. Below is a summary of the targeted store data.

data_info-2.png

Examples of Solving Each Problem ΒΆ

Import packages ΒΆ

InΒ [131]:
## for data processing
import pandas as pd
import numpy as np

## for plotting
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

## interacitve map
import folium
from folium import plugins

## for simple routing
import osmnx as ox   # street map data
import networkx as nx   # graph object

## for advanced routing 
from ortools.constraint_solver import pywrapcp
from ortools.constraint_solver import routing_enums_pb2

import warnings

# Ignore all warnings
warnings.filterwarnings("ignore")
InΒ [173]:
df = pd.read_csv('dataset.csv')
df[(df["Country"]=='US') & (df["State/Province"]=='CA') & (df["City"]=='San Francisco')].info()
df = df[(df["Country"]=='US') & (df["State/Province"]=='CA')].groupby('City')['Store Number'].count().sort_values(ascending=False)
<class 'pandas.core.frame.DataFrame'>
Index: 89 entries, 14694 to 14783
Data columns (total 13 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Brand           89 non-null     object 
 1   Store Number    89 non-null     object 
 2   Store Name      89 non-null     object 
 3   Ownership Type  89 non-null     object 
 4   Street Address  89 non-null     object 
 5   City            89 non-null     object 
 6   State/Province  89 non-null     object 
 7   Country         89 non-null     object 
 8   Postcode        89 non-null     object 
 9   Phone Number    87 non-null     object 
 10  Timezone        89 non-null     object 
 11  Longitude       89 non-null     float64
 12  Latitude        89 non-null     float64
dtypes: float64(2), object(11)
memory usage: 9.7+ KB

Preprocessing ΒΆ

InΒ [174]:
# read data and filter out LA stores
df = pd.read_csv('dataset.csv')
df = df[(df["Country"]=='US') & (df["State/Province"]=='CA') & (df["City"]=='San Francisco')][
        ["City","Street Address","Latitude","Longitude"]
      ].reset_index(drop=True)
df = df.reset_index().rename(
      columns={"index":"id", "Latitude":"y", "Longitude":"x"})

print(df.info())
df.head(3)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 89 entries, 0 to 88
Data columns (total 5 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   id              89 non-null     int64  
 1   City            89 non-null     object 
 2   Street Address  89 non-null     object 
 3   y               89 non-null     float64
 4   x               89 non-null     float64
dtypes: float64(2), int64(1), object(2)
memory usage: 3.6+ KB
None
Out[174]:
id City Street Address y x
0 0 San Francisco 865 Market St., #324 37.78 -122.41
1 1 San Francisco 2018 Market Street 37.77 -122.43
2 2 San Francisco 120 4th Street 37.78 -122.40
InΒ [175]:
# set up the depot (the place that truck / salesman starts traveling)
i = 74
df["base"] = df["id"].apply(lambda x: 1 if x==i else 0)
base = df[df["base"]==1][["y","x"]].values[0]

print("base =", base)
df.head(3)
base = [  37.77 -122.43]
Out[175]:
id City Street Address y x base
0 0 San Francisco 865 Market St., #324 37.78 -122.41 0
1 1 San Francisco 2018 Market Street 37.77 -122.43 0
2 2 San Francisco 120 4th Street 37.78 -122.40 0
InΒ [176]:
# plot out the distribution of x and y
plt.scatter(y=df["y"], x=df["x"], color="black")
plt.scatter(y=base[0], x=base[1], color="red")
plt.show()
No description has been provided for this image
InΒ [177]:
# Plot stores on realistic map using folium package
# setup
def create_map():
    data = df.copy()
    color = "base"  #color based on this column
    lst_colors = ["black","red"]
    popup = ["id"] #popup based on this column
    lst_icon = ['glyphicon glyphicon-star-empty', 'glyphicon-home']

    # base map
    map_ = folium.Map(location=base, zoom_start=11)

    # add colors
    lst_elements = sorted(list(data[color].unique()))
    data["color"] = data[color].apply(lambda x: 
                    lst_colors[lst_elements.index(x)])
    data["icon"] = data[color].apply(lambda x: 
                    lst_icon[lst_elements.index(x)])

    # add popup
    data.apply(lambda row: 
        folium.Marker(
                location=[row["y"],row["x"]], tooltip=f"id: {row[popup].values[0]}, base?: {row['base']}"
                , fill=True, icon=folium.Icon(color=row["color"], icon=row["icon"])).add_to(map_), 
        axis=1)


    # Add different versions of map
    layers = ["cartodbpositron", "openstreetmap","CartoDB.Voyager"]

    for tile in layers:
        folium.TileLayer(tile).add_to(map_)
    folium.LayerControl(position='bottomright',collapsed=False).add_to(map_)
    return map_


# show
map_ = create_map()
map_
Out[177]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Shortest Path Problem ΒΆ

Creating Road Map ΒΆ

To get the shortest path from base to any location. We need to establish connections between these nodes using roads as links. OSMnx is a library that can query OpenStreetMap data, extract road details, convert them into links, and then transform this data into graphs.

The variable "G" in the code below represents a road network within a 20 km radius of the base point. This network aids in calculating the shortest travel time from the base to our destination. The graph visually represents the road network, with each circle denoting nodes or road intersections, and the lines indicating distance between nodes.

InΒ [178]:
# create network graph
G = ox.graph_from_point(base, dist=50000, 
        network_type="drive")  #'drive', 'bike', 'walk'
G = ox.add_edge_speeds(G)
G = ox.add_edge_travel_times(G)

# plot
fig, ax = ox.plot_graph(G, bgcolor="black", node_size=5, 
        node_color="white", figsize=(16,8))
No description has been provided for this image

Calculating Shortest Path ΒΆ

After obtaining the distance and time information, we can compute the shortest path from the origin (base) to the destination. In this case, I've set the destination to index 10. I experimented with two different objective functions: minimizing traveling distance and minimizing traveling time. Both optimizations were achieved using Dijkstra's algorithm.

The black and white map below illustrates the results of these shortest paths. The red line represents the optimal path when prioritizing traveling distance, while the blue line indicates the path with the shortest traveling time.

InΒ [179]:
# Example: setting id = 10 as destination
end = df[df["id"]==10][["y","x"]].values[0]
print("locations: from", base, "--> to", end)

# Determine the nearest node on the road map to both our base store and destination store
base_node = ox.distance.nearest_nodes(G, base[1], base[0])
end_node = ox.distance.nearest_nodes(G, end[1], end[0])
print("nodes: from", base_node, "--> to", end_node)

# calculate shortest path using Dijkstra’s algorithm
## prioritizing distance
path_lenght = nx.shortest_path(G, source=base_node, target=end_node, 
                                method='dijkstra', weight='lenght')     
print(path_lenght)

## prioritizing time
path_time = nx.shortest_path(G, source=base_node, target=end_node, 
                                method='dijkstra', weight='travel_time')     
print(path_time)

# plot on the graph and compare two different objective functions results
fig, ax = ox.plot_graph_routes(G, routes=[path_lenght, path_time], 
                              route_colors=["red","blue"], 
                              route_linewidth=5, node_size=1, 
                              bgcolor='black', node_color="white", 
                              figsize=(16,8))
locations: from [  37.77 -122.43] --> to [  37.8  -122.43]
nodes: from 65365163 --> to 65336112
[65365163, 65350533, 65332198, 65365166, 65327905, 65365168, 65310809, 65354073, 65325380, 65356398, 65287232, 65352349, 65308302, 65365172, 65365176, 65317971, 65338304, 258761269, 65314192, 65303573, 65319981, 65352464, 65295361, 65307440, 65336914, 65365195, 65337437, 65365236, 65333932, 65299426, 65345288, 65312706, 65312705, 65292764, 65292766, 65319050, 65319054, 65336112]
[65365163, 65318023, 65328181, 65328183, 65328185, 65328187, 65327904, 65328191, 65310806, 65306336, 65328196, 65328198, 65325378, 65324344, 342543655, 65328203, 65287225, 65328209, 65308300, 65328216, 65328224, 65328240, 65317967, 258760438, 258760426, 258760428, 2163049313, 65314189, 65303568, 342544910, 65319976, 65328348, 65295355, 65307438, 65328349, 65328351, 65328354, 65328358, 65328360, 65292758, 65292759, 4011962210, 65292764, 65312700, 65312698, 65322710, 65336112]
No description has been provided for this image

We can further visualize the optimal route by plotting it back onto the Folium map we created earlier in the previous section.

InΒ [180]:
# plot on the map
ox.plot_route_folium(G, route=path_lenght, route_map=map_, 
                     color="red", weight=1)
ox.plot_route_folium(G, route=path_time, route_map=map_, 
                     color="blue", weight=1)
map_
Out[180]:
Make this Notebook Trusted to load map: File -> Trust Notebook

We can also simulate the driver's route using Plotly.

InΒ [181]:
# Create route dataframe using the path that optimizing driving time (variable: path_time)
lst_start, lst_end = [],[]
start_x, start_y = [],[]
end_x, end_y = [],[]
lst_length, lst_time = [],[]

# Each row represents the route that the driver will pass between the nodes by sequence
for a,b in zip(path_time[:-1], path_time[1:]):
    lst_start.append(a)
    lst_end.append(b)
    lst_length.append(round(G.edges[(a,b,0)]['length']))
    lst_time.append(round(G.edges[(a,b,0)]['travel_time']))
    start_x.append(G.nodes[a]['x'])
    start_y.append(G.nodes[a]['y'])
    end_x.append(G.nodes[b]['x'])
    end_y.append(G.nodes[b]['y'])

route_simulate_df = pd.DataFrame(list(zip(lst_start, lst_end, 
                           start_x, start_y, end_x, end_y, 
                           lst_length, lst_time)), 
                   columns=["start","end","start_x","start_y",
                            "end_x","end_y","length","travel_time"]
                  ).reset_index().rename(columns={"index":"id"})

print(route_simulate_df.head())


## create start/end df 
df_base = route_simulate_df[route_simulate_df["start"] == base_node]
df_end = route_simulate_df[route_simulate_df["end"] == end_node]

## create basic map
fig = px.scatter_mapbox(data_frame=route_simulate_df, lon="start_x", lat="start_y", 
                        zoom=10, width=800, height=500, 
                        animation_frame="id", 
                        mapbox_style="carto-positron")
## add driver
fig.data[0].marker = {"size":12}
## add start point
fig.add_trace(px.scatter_mapbox(data_frame=df_base, 
                                lon="start_x", lat="start_y").data[0])
fig.data[1].marker = {"size":15, "color":"red"}
## add end point
fig.add_trace(px.scatter_mapbox(data_frame=df_end, 
                                lon="start_x", lat="start_y").data[0])
fig.data[2].marker = {"size":15, "color":"green"}
## add route
fig.add_trace(px.line_mapbox(data_frame=route_simulate_df, 
                             lon="start_x", lat="start_y").data[0])
fig
   id     start       end     start_x    start_y       end_x      end_y  \
0   0  65365163  65318023 -122.430084  37.770249 -122.429248  37.770355   
1   1  65318023  65328181 -122.429248  37.770355 -122.428445  37.770457   
2   2  65328181  65328183 -122.428445  37.770457 -122.428537  37.770927   
3   3  65328183  65328185 -122.428537  37.770927 -122.428629  37.771391   
4   4  65328185  65328187 -122.428629  37.771391 -122.428722  37.771849   

   length  travel_time  
0      74            7  
1      72            6  
2      53            4  
3      52            4  
4      52            4  

Traveling Salesman Problem ΒΆ

Using the nodes and links on the roadmap, we can delve into solving routing problems that involve passing through multiple stops and encompass more complex business constraints. I suggest beginning with the Traveling Salesman Problem, as it simplifies the business settings with just one loop and a single driver. Our goal is to minimize travel time as the objective function.

Initially, I create a distance matrix that includes the shortest travel times between nodes representing the stores. Subsequently, I employ OR-Tools to solve the problem. OR-Tools is a Python package developed by Google for solving linear programming, vehicle routing, and various related optimization problems.

Creating Distance Matrix ΒΆ

I map the nodes from the roadmap back to the store information table and remove duplicate nodes. Additionally, I fill the null values in the distance matrix with row means and overall means.

InΒ [182]:
# get the node from the roadmap for each store location
df_salesman = df.copy()
df_salesman["node"] = df_salesman[["y","x"]].apply(lambda x: 
                           ox.distance.nearest_nodes(G, x[1], x[0]), 
                        axis=1)
# some stores might share the same nodes
# drop duplicate nodes so that we won't pass by same location multiple times
df_salesman = df_salesman.drop_duplicates("node", keep='first')
df_salesman.info()
<class 'pandas.core.frame.DataFrame'>
Index: 38 entries, 0 to 85
Data columns (total 7 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   id              38 non-null     int64  
 1   City            38 non-null     object 
 2   Street Address  38 non-null     object 
 3   y               38 non-null     float64
 4   x               38 non-null     float64
 5   base            38 non-null     int64  
 6   node            38 non-null     int64  
dtypes: float64(2), int64(3), object(2)
memory usage: 2.4+ KB
InΒ [183]:
# Create distance matrix for later optimization inputs
## distance length function
def f(a,b):
    try:
        d = nx.shortest_path_length(G, source=a, target=b, 
                                    method='dijkstra', 
                                    weight='travel_time')
    except:
        d = np.nan
    return d


## apply the function
distance_matrix = np.asarray([[f(a,b) for b in df_salesman["node"].tolist()] 
                               for a in df_salesman["node"].tolist()])
distance_matrix = pd.DataFrame(distance_matrix, 
                               columns=df_salesman["node"].values, 
                               index=df_salesman["node"].values)
distance_matrix.head()
Out[183]:
65319536 65365163 65335256 65308300 4062847020 65307198 65298939 258911631 6311883023 1723739241 ... 65352339 65373364 65342609 65292743 65322686 65317190 65350725 6424341847 65342677 65288870
65319536 0.0 189.3 104.3 191.5 716.5 722.5 518.0 558.7 185.6 NaN ... 112.0 212.6 473.0 279.7 260.6 422.7 553.5 214.6 567.9 380.3
65365163 211.7 0.0 265.4 90.8 555.0 578.7 380.7 438.4 281.1 NaN ... 155.3 381.4 375.0 186.2 345.2 324.7 394.2 310.2 424.1 228.4
65335256 144.2 258.7 0.0 299.7 677.7 821.7 479.2 664.6 85.0 NaN ... 240.1 195.8 504.6 318.4 279.2 433.0 514.7 114.4 667.1 441.9
65308300 199.9 106.2 261.6 0.0 659.2 609.9 483.9 394.2 335.4 NaN ... 74.4 348.0 284.2 95.4 254.4 233.9 498.4 364.5 455.3 321.0
4062847020 728.0 554.8 727.4 638.1 0.0 267.6 422.3 492.2 695.9 NaN ... 702.6 854.4 728.6 733.5 892.5 787.5 265.2 663.8 299.4 396.5

5 rows Γ— 38 columns

InΒ [184]:
# Check null values
heatmap = distance_matrix.copy()
for col in heatmap.columns:
    heatmap[col] = heatmap[col].apply(lambda x: 
                       0.3 if pd.isnull(x) else  #nan -> purple
                      (0.7 if np.isinf(x) else   #inf -> orange
                      (0 if x!=0 else 1) ))      # 0  -> white  

fig, ax = plt.subplots(figsize=(10,5))
sns.heatmap(heatmap, vmin=0, vmax=1, cbar=False, ax=ax)
plt.show()

# Fill null values with row mean and total mean
# fillna with row average
distance_matrix = distance_matrix.T.fillna(distance_matrix.mean(axis=0)).T

# fillna with overall average
distance_matrix = distance_matrix.fillna(distance_matrix.mean().mean())
No description has been provided for this image
InΒ [185]:
# Check null values after filling na
heatmap = distance_matrix.copy()
for col in heatmap.columns:
    heatmap[col] = heatmap[col].apply(lambda x: 
                       0.3 if pd.isnull(x) else  #nan -> purple
                      (0.7 if np.isinf(x) else   #inf -> orange
                      (0 if x!=0 else 1) ))      # 0  -> white  

fig, ax = plt.subplots(figsize=(10,5))
sns.heatmap(heatmap, vmin=0, vmax=1, cbar=False, ax=ax)
plt.show()
No description has been provided for this image

Calculting TSP Optimal Path ΒΆ

First, establish the business parameters, noting that we have only one driver available. Subsequently, initialize a routing index manager with details about the nodes, drivers, and base node. This manager is then used to create a routing model, which serves as the basis for applying optimization algorithms to determine optimal routes for vehicles or comparable routing scenarios.

InΒ [186]:
# Business settings
drivers = 1

lst_nodes = df_salesman["node"].tolist()
print("start:", base_node, "| tot locations to visit:", 
     len(lst_nodes)-1, "| drivers:", drivers)
start: 65365163 | tot locations to visit: 37 | drivers: 1
InΒ [187]:
manager = pywrapcp.RoutingIndexManager(len(lst_nodes), 
                                       drivers, 
                                       lst_nodes.index(base_node))
model = pywrapcp.RoutingModel(manager)

Next, set up a callback function (get_distance) to dynamically compute distances between nodes based on a distance matrix. It then registers this callback with the routing model and configures the model to use this callback to evaluate arc costs for all vehicles during the optimization process.

InΒ [188]:
# Get the traveling time from node to node
def get_distance(from_index, to_index):
    return distance_matrix.iloc[from_index,to_index]

distance = model.RegisterTransitCallback(get_distance)
model.SetArcCostEvaluatorOfAllVehicles(distance)

Then, set up routing search parameters with a specific first solution strategy, which influences how OR-Tools explores the search space and finds an initial solution to the routing problem. Here, we use "PATH_CHEAPEST_ARC", a strategy that prioritizes the selection of the most economical edge at each stage of route construction. This strategy is similar the the nearest neighbor search algorothm which primarily emphasizes information about nearby roads or edges.

InΒ [189]:
# Searching strategy settings1
parameters = pywrapcp.DefaultRoutingSearchParameters()
parameters.first_solution_strategy = (
          routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC  # select the cheapest edge at each search
)

Finally, we can start running the solver. The model.SolveWithParameters(parameters) initiates the optimization process to find an initial solution based on the strategy. Then, it iterates through the solution to construct the route for the driver, starting from the first node and moving along the path until reaching the end.

The loop retrieves the node indices along the route, calculates the total distance traveled using the get_distance function, and updates the route distance accordingly. Finally, the code prints out the route indices, the total time taken in hours, and the number of nodes visited in the route.

InΒ [190]:
# PATH_CHEAPEST_ARC solution
solution = model.SolveWithParameters(parameters)

index = model.Start(0)
print('Route for driver:')
route_idx, route_distance = [], 0
while not model.IsEnd(index):
    route_idx.append( manager.IndexToNode(index) ) 
    previous_index = index
    index = solution.Value( model.NextVar(index) )
    ### update distance
    try:
        route_distance += get_distance(previous_index, index)
    except:
        route_distance += model.GetArcCostForVehicle(
                              from_index=previous_index, 
                              to_index=index, 
                              vehicle=0)

print(route_idx)
print(f'Total Time: {round(route_distance/3600, 2)} hours')
print(f'Nodes visited: {len(route_idx)}')
Route for driver:
[1, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 0]
Total Time: 4.26 hours
Nodes visited: 38

Likewise, we can overlay the optimal route onto the map generated in the preceding section. The blue line on the map signifies the driver's path to visit all stores. Although we have the option to use Plotly for simulating the trip, I prefer to omit that step for the sake of simplicity.

InΒ [191]:
# Map the salesman's route back to the map
## Get path between nodes
def get_path_between_nodes(lst_route):
    lst_paths = []
    for i in range(len(lst_route)):
        try:
            a, b = lst_nodes[i], lst_nodes[i+1]
        except:
            break
        try:
            path = nx.shortest_path(G, source=a, target=b, 
                                    method='dijkstra', 
                                    weight='travel_time')
            if len(path) > 1:
                lst_paths.append(path)
        except:
            continue
    return lst_paths

lst_paths = get_path_between_nodes(route_idx)
map_ = create_map()

# Add paths on the map
for path in lst_paths:
    ox.plot_route_folium(G, route=path, route_map=map_, 
                         color="blue", weight=1)
map_
Out[191]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Vehicle Routing Problem ΒΆ

Lastly, I would like to deal with Vehicle Routing Problem which involve more constraints and more complexity. Here, I would like to introduce limitations for the drivers in both carrying capacity and coverable distance. In this example, we can assume that we have 5 drivers, each capable of driving for 8 hours per day and restricted to visiting a maximum of 15 stores daily. Similarly, we also want to minimize total traveling time.

Calculating VRP Optimal Path ΒΆ

General SettingsΒΆ

First, specify the number of drivers available, the capacity of each driver, the total demand (total number of nodes to be visited), and the maximum allowable travel time.

InΒ [192]:
## Business parameters and constraints
drivers2 = 3
driver_capacities = [13,13,13]
demands = [0] + [1]*(len(lst_nodes)-1) # different store could have different demand but here we assume all = 1
max_time = 2 * 60 * 60  # drive 3 hours a day

Similarly, we create a manager object to track progress, a model to establish the simulation environment, and a distance function for computing costs.

InΒ [193]:
## new manager and model
manager2 = pywrapcp.RoutingIndexManager(len(lst_nodes), 
                                       drivers2, 
                                       lst_nodes.index(base_node))
model2 = pywrapcp.RoutingModel(manager2)

## add traveling time (cost)
def get_distance(from_index, to_index):
    return distance_matrix.iloc[from_index,to_index]

distance2 = model2.RegisterTransitCallback(get_distance)
model2.SetArcCostEvaluatorOfAllVehicles(distance2)

Setting ConstraintsΒΆ

Different from the traveling salesman problem, we need to add more contraints to the model.

InΒ [194]:
## add capacity constraint
def get_demand(from_index):
    return demands[from_index]

demand = model2.RegisterUnaryTransitCallback(get_demand) # provide the capacity requirement for each node (store)
model2.AddDimensionWithVehicleCapacity(demand, slack_max=0,  # Sets the maximum extra capacity for each vehicle.
                                     vehicle_capacities=driver_capacities, 
                                     fix_start_cumul_to_zero=True, # Cumulative demand starts at zero for each vehicle's route
                                     name='Capacity')

# add traveling time constraint
name = 'Distance'
model2.AddDimension(distance2, slack_max=0, capacity=max_time, # restrict the maximum distance traveled by each vehicle
                   fix_start_cumul_to_zero=True  
                   , name=name)
distance_dimension = model2.GetDimensionOrDie(name)

# sets a cost coefficient for the distance dimension
# influence how the model prioritizes minimizing travel distances while optimizing routes
distance_dimension.SetGlobalSpanCostCoefficient(100)

Here I use the same policy as Traveling Salesman Problem, choosing the cheapest edge near the node.

InΒ [195]:
# set strategy to minimize traveling time
parameters2 = pywrapcp.DefaultRoutingSearchParameters()
parameters2.first_solution_strategy = (
          routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC
)
solution2 = model2.SolveWithParameters(parameters2)

After setting the environment and constrains, we can start solving the problem. It iterate through each driver's route, starting from their designated starting point and ending when the driver completes their route. Within the loop, the code tracks the route indices, distance traveled, and load carried by each driver. The get_distance function or the model's method model2.GetArcCostForVehicle is used to calculate the distance traveled between nodes, and the demands dictionary is accessed to determine the load carried by each node.

Finally, we can get the route indices, traveling time, and load for each driver, as well as the total traveling time and load for all drivers combined.

InΒ [196]:
dic_routes_idx, total_distance, total_load = {}, 0, 0
for driver in range(drivers2):
    print(f'Route for driver {driver}:')
    index = model2.Start(driver)
    route_idx, route_distance, route_load = [], 0, 0
    while not model2.IsEnd(index):
        node_index = manager2.IndexToNode(index)
        route_idx.append( manager2.IndexToNode(index) )
        previous_index = index
        index = solution2.Value( model2.NextVar(index) )
        ### update distance
        try:
            route_distance += get_distance(previous_index, index)
        except:
            route_distance += model2.GetArcCostForVehicle(
                                from_index=previous_index, 
                                to_index=index, 
                                vehicle=driver)
        ### update load
        route_load += demands[node_index]
        
    route_idx.append( manager2.IndexToNode(index) )
    print(route_idx)
    dic_routes_idx[driver] = route_idx
    print(f'traveling time: {round(route_distance/3600,2)} hours')
    print(f'load: {round(route_load,2)}', "\n")
    total_distance += route_distance
    total_load += route_load
    
print(f'traveling time: {round(total_distance/3600,2)} hours')
print(f'Total load: {total_load}')
Route for driver 0:
[1, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
traveling time: 1.19 hours
load: 11 

Route for driver 1:
[1, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 1]
traveling time: 1.36 hours
load: 14 

Route for driver 2:
[1, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 0, 1]
traveling time: 1.42 hours
load: 14 

traveling time: 3.97 hours
Total load: 39

We can also visualize the path in the map.

InΒ [200]:
# Convert from idx to nodes
dic_route = {}
for k,v in dic_routes_idx.items():
    print(f"Route for driver {k} (nodes):")
    dic_route[k] = [lst_nodes[i] for i in v]
    print(dic_route[k], "\n")

# Get path between nodes
dic_paths = {k:get_path_between_nodes(v) for k,v in dic_route.items()}
map_ = create_map()

# Add paths on the map
lst_colors = ["green","red","blue"]

for k,v in dic_paths.items():
    for path in v:
        ox.plot_route_folium(G, route=path, route_map=map_, 
                             color=lst_colors[k], weight=round(5/(k+1),0), opacity=0.5)
                       color=lst_colors[2], weight=0.5, opacity=0.5)
map_
Route for driver 0 (nodes):
[65365163, 6311848840, 65336112, 1723739241, 6311883023, 258911631, 65298939, 65307198, 4062847020, 65308300, 65335256, 65365163] 

Route for driver 1 (nodes):
[65365163, 434685829, 65319946, 65283764, 65303726, 11834834867, 65287649, 65340480, 65336921, 65312219, 65299283, 65327169, 65290299, 65340931, 65365163] 

Route for driver 2 (nodes):
[65365163, 65288870, 65342677, 6424341847, 65350725, 65317190, 65322686, 65292743, 65342609, 65373364, 65352339, 65308466, 65298596, 65321070, 65319536, 65365163] 

Out[200]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Reference ΒΆ

  • Google OrTools Document
  • Folium Documentation
  • Modern Route Optimization with Python
  • Vehicle routing problem
  • Starbucks Locations Worldwide